In the fight against COVID-19, analyzing and exploring health data from multiple healthcare providers poses many obstacles. To address this, we have been working with regional healthcare provider organizations to identify shared common data elements that can be used to harmonize and manage data sharing within our Chicagoland Pandemic Response Commons. We established a Statistical Summary Reports (SSR) node in our data model to store a range of de-identified statistical reports from multiple partnering healthcare providers. Our goal is to accelerate the exchange of health data among healthcare facilities, researchers, and response authorities during public health emergencies. For more information, please visit the Chicagoland Pandemic Response Commons.
This notebook provides an example of how to visualize COVID-19 status and general patient information at multiple regional healthcare providers, using demo data from 03/10/2020 to 11/10/2020. The Chicagoland Pandemic Response Commons disclaims responsibility concerning the data’s accuracy, reliability, completeness, timeliness, or usefulness.
Uncomment the lines below to install any libraries you need, then import the required modules:
# !pip install plotly==4.7.1
# !pip install gen3
import requests, json, fnmatch, os, os.path, sys, subprocess, glob, ntpath, copy
import pandas as pd
import numpy as np
from pandas import json_normalize  # top-level import path, available since pandas 1.0
from collections import Counter
import gen3
from gen3.auth import Gen3Auth
from gen3.submission import Gen3Submission
from gen3.file import Gen3File
import plotly.graph_objects as go
import plotly.express as px
import warnings
warnings.simplefilter('ignore')
api = "https://chicagoland.pandemicresponsecommons.org"
creds = "credentials.json"
auth = Gen3Auth(api, refresh_file=creds)
sub = Gen3Submission(api, auth)
file = Gen3File(api, auth)
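Export the metadata stored under a specific node and project using the SDK function export_node. The arguments are, in order, the program, the project, the node type, the file format, and the output file name.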
statistical_summary = sub.export_node(
"controlled",
"SSR",
"statistical_summary_report",
"tsv",
"./statistical_summary_report.tsv",
)
statistical_summary = pd.read_csv("./statistical_summary_report.tsv", sep="\t")
Please note that small cell counts (fewer than 5 data points) for any property are not submitted to Gen3; they are entered as null values to protect patient privacy.
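Because suppressed counts arrive as nulls rather than zeros, it can help to check how many values are missing in each column before plotting or summing. The sketch below uses a small made-up table, not Commons data: the column names match the export, but every value is hypothetical.

```python
import pandas as pd

# Hypothetical toy data standing in for the exported report.
# None marks a count that was suppressed because it was small.
toy = pd.DataFrame(
    {
        "num_COVID": [12, None, 30],
        "num_COVID_deaths": [None, None, 6],
        "num_icu": [7, 9, 11],
    }
)

# Count the suppressed (null) entries in each column.
missing_per_column = toy.isna().sum()
print(missing_per_column)

# Nulls are skipped when summing, so these totals are lower bounds,
# not exact counts.
lower_bound_totals = toy.sum(skipna=True)
print(lower_bound_totals)
```

Plotly leaves a gap wherever a value is null, so a missing point in the figures may mean a suppressed small count rather than an absence of patients.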
statistical_summary["date"] = (
statistical_summary["submitter_id"].str.split("_").str.get(-1)
)
statistical_summary = statistical_summary.sort_values(
by="date", ascending=True
).reset_index()
statistical_summary = statistical_summary[
[
"date",
"num_COVID",
"num_COVID_deaths",
"num_admitted",
"num_outpatient",
"num_asth",
"num_card",
"num_chf",
"num_diab",
"num_icu",
"num_obes",
"num_pneu",
"num_resp",
"num_vent",
]
]
statistical_summary = statistical_summary.rename(columns={"date": "Date"})
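The date is taken from the last underscore-separated piece of submitter_id and kept as a string. Sorting strings gives chronological order only when the format is ISO-like (YYYY-MM-DD), so converting the column to datetime may be a safer choice. The sketch below assumes an ISO date suffix; the submitter_id values are hypothetical and the real format in the Commons may differ.

```python
import pandas as pd

# Hypothetical submitter_id values, assumed to end in an ISO date.
toy = pd.DataFrame(
    {
        "submitter_id": [
            "ssr_siteA_2020-11-10",
            "ssr_siteA_2020-03-10",
            "ssr_siteA_2020-05-28",
        ]
    }
)

# Same extraction as in the notebook: keep the last underscore-separated token.
toy["Date"] = toy["submitter_id"].str.split("_").str.get(-1)

# Parse to datetime. errors="coerce" turns unparseable tokens into NaT
# instead of raising, so bad identifiers can be inspected afterwards.
toy["Date"] = pd.to_datetime(toy["Date"], format="%Y-%m-%d", errors="coerce")

# Sort chronologically and discard the old index.
toy = toy.sort_values("Date").reset_index(drop=True)
print(toy)
```

If any rows come back as NaT, the assumed date format does not match those identifiers and the format string would need adjusting.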
The plot below shows confirmed cases in the demo data from 03/10/2020 to 11/10/2020. COVID death counts are available from 05/28/2020 to 11/10/2020.
# make figure
fig_dict = {"data": [], "layout": {}, "frames": []}
data_dict = {
"x": statistical_summary.Date,
"y": statistical_summary.num_COVID,
"mode": "markers",
"name": "Confirmed",
}
# add the confirmed-cases trace (the layout is set later via update_layout)
fig_dict["data"] = [go.Scatter(data_dict)]
figure = go.Figure(fig_dict)
figure.add_trace(
go.Scatter(
x=statistical_summary.Date,
y=statistical_summary.num_COVID_deaths,
mode="markers",
name="Death",
)
)
figure.update_layout(
hovermode="closest",
title="Number of Confirmed COVID-19 Cases and Deaths",
xaxis_title="Date",
yaxis_title="Number of Confirmed/Deaths",
autosize=True,
width=900,
height=650,
legend_title_text="",
legend=dict(orientation="v", yanchor="top", y=1.02, xanchor="right", x=1),
updatemenus=[
dict(
type="buttons",
direction="left",
buttons=list(
[
dict(
args=[{"yaxis.type": "linear"}],
label="Linear",
method="relayout",
),
dict(args=[{"yaxis.type": "log"}], label="Log", method="relayout"),
]
),
),
],
)
figure.show("notebook")
(Data from 03/10/2020 to 11/10/2020)
fig = px.scatter(
statistical_summary,
x="Date",
y="num_COVID",
animation_frame="Date",
range_y=[0, 250],
range_x=["2020-03-10", "2020-11-10"],
)
fig.update_traces(
marker=dict(size=12, line=dict(width=1, color="Black")),
selector=dict(mode="markers"),
)
fig.update_layout(
hovermode="x unified",
title="Number of Confirmed COVID-19 Cases",
xaxis_title="Date",
yaxis_title="Number of Confirmed Cases",
autosize=True,
width=900,
height=650,
legend_title_text="",
legend=dict(orientation="v", yanchor="top", y=1.02, xanchor="right", x=1),
)
fig.update_layout(sliders=[dict(transition=dict(duration=5), len=0.91)])
fig.show("notebook")
(Data from 05/28/2020 to 10/15/2020, as an example. Users can adjust the date range by changing range_x.)
df1 = statistical_summary[
[
"Date",
"num_asth",
"num_card",
"num_chf",
"num_diab",
"num_obes",
"num_pneu",
"num_resp",
]
]
df1 = df1.rename(
columns={
"num_asth": "Asthma",
"num_card": "Cardiovascular disease",
"num_chf": "Congestive heart failure",
"num_diab": "Diabetes",
"num_obes": "Obesity",
"num_pneu": "Pneumonia",
"num_resp": "Respiratory conditions",
}
)
df2 = df1.melt(
id_vars=["Date"], var_name="Health condition", value_name="Number of Patients"
)
fig = px.line(df2, x="Date", y="Number of Patients", color="Health condition", range_x=["2020-05-28", "2020-10-15"])
fig.update_layout(
title="Number of Inpatients with Various Preexisting Health Conditions",
hoverlabel=dict(font_size=10),
hovermode="x unified",
updatemenus=[
dict(
type="buttons",
direction="left",
buttons=list(
[
dict(
args=[{"yaxis.type": "linear"}],
label="Linear",
method="relayout",
),
dict(args=[{"yaxis.type": "log"}], label="Log", method="relayout"),
]
),
),
],
)
fig.show("notebook")
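The line chart above depends on reshaping the table from wide form (one column per condition) to long form (one row per date and condition) with melt, because the color argument of px.line expects a single column of category labels. A minimal illustration of that reshaping step, with made-up numbers:

```python
import pandas as pd

# Hypothetical wide table: one row per date, one column per condition.
wide = pd.DataFrame(
    {
        "Date": ["2020-05-28", "2020-05-29"],
        "Asthma": [10, 12],
        "Diabetes": [25, 27],
    }
)

# Long form: each (Date, condition) pair becomes its own row.
long = wide.melt(
    id_vars=["Date"], var_name="Health condition", value_name="Number of Patients"
)
print(long)
```

Each condition column contributes one row per date, so two dates and two conditions give four rows.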
The statistical summary reports are collected from various sources without any normalization and may reflect inconsistent submissions. This demo notebook therefore only demonstrates how to access and visualize the data, for users who may not be familiar with Jupyter Notebooks and are interested in the trends shown by the SSRs. The data themselves should not be regarded as containing useful information. The data are not intended to constitute advice, nor are they to be used as a substitute for decision making by a professional. Users should not act on the information here without independently verifying it and obtaining any necessary professional advice.
This data may be updated at irregular intervals. Users should check for updates regularly and ensure the most current version of the data is being used.